University of Ottawa's Participation in the CL-SR Task at CLEF 2006

نویسندگان

  • Muath Alzghool
  • Diana Inkpen
چکیده

This paper presents the second participation of the University of Ottawa group in CLEF, the CrossLanguage Spoken Retrieval (CL-SR) task. We present the results of the submitted runs for the English collection and very briefly for the Czech collection, followed by many additional experiments. We have used two Information Retrieval systems in our experiments: SMART and Terrier were tested with many different weighting schemes for indexing the documents and the queries and with several query expansion techniques (including a new method based on log-likelihood scores for collocations). Our experiments showed that query expansion methods do not help much for this collection. We tested whether the new Automatic Speech Recognition transcripts improve the retrieval results; we also tested combinations of different automatic transcripts (with different estimated word error rates). The retrieval results did not improve, probably because the speech recognition errors happened for the words that are important in retrieval, even in the newer ASR2006 transcripts. By using different system settings, we improved on our submitted result for the required run (English queries, title and description) on automatic transcripts plus automatic keywords. We present crosslanguage experiments, where the queries are automatically translated by combining the results of several online machine translation tools. Our experiments showed that high quality automatic translations (for French) led to results comparable with monolingual English, while the performance decreased for the other languages. Experiments on indexing the manual summaries and keywords gave the best retrieval results.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

University of Ottawa's Contribution to CLEF 2005, the CL-SR Track

We present the participation of the University of Ottawa in the Cross-Language Spoken Document Retrieval task at CLEF 2005. In order to translate the queries, we combined the results of several online Machine Translation tools. For the Information Retrieval component we used the SMART system, with several weighting schemes for indexing the documents and the queries. One scheme in particular lea...

متن کامل

Benefit of Proper Language Processing for Czech Speech Retrieval in the CL-SR Task at CLEF 2006

The paper describes the system built by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We have decided to concentrate only on the monolingual searching in the Czech test collection and investigate the effect of proper language processing on the retrieval performance. We have employed the Czech morphological analyser and tagger for that purposes. For...

متن کامل

The University of West Bohemia at CLEF 2006, the CL-SR Track

The paper describes the system build by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We have decided to concentrate only on the monolingual searching in the Czech test collection. We have employed the Czech morphological analyser and tagger in order to perform necessary linguistic preprocessing (lemmatization and stop-word removal). As for the act...

متن کامل

Benefit of Proper Language Processing for Czech Speech Retrieval in the CL-SR Task at CLEF

The paper describes the system built by the team from the University of West Bohemia for participation in the CLEF 2006 CL-SR track. We have decided to concentrate only on the monolingual searching in the Czech test collection and investigate the effect of proper language processing on the retrieval performance. We have employed the Czech morphological analyser and tagger for that purposes. For...

متن کامل

Dublin City University at CLEF 2006: Cross-Language Speech Retrieval (CL-SR) Experiments

The Dublin City University participation in the CLEF 2006 CL-SR task concentrated on exploring the combination of the multiple fields associated with the documents. This was based on use of the extended BM25F field combination model originally developed for multi-field text documents. Additionally, we again conducted runs with our existing information retrieval methods based on the Okapi model....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006